1 Distributed Information Retrieval: Exploiting a controlled vocabulary to improve
collection selection and retrieval effectiveness

James C. French, Allison L. Powell, Fredric Gey, Natalia Perelman 

October 2001 Proceedings of the tenth international conference on Information and 
knowledge management 

Publisher: ACM Press 

Full text available: pdf(1.47 MB) Additional Information: full citation, abstract, references, index terms

Vocabulary incompatibilities arise when the terms used to index a document collection are 
largely unknown, or at least not well-known to the users who eventually search the 
collection. No matter how comprehensive or well-structured the indexing vocabulary, it is 
of little use if it is not used effectively in query formulation. This paper demonstrates that 
techniques for mapping user queries into the controlled indexing vocabulary have the 
potential to radically improve document retrieval perform ... 



2 Lexicons, corpora, and evaluation: Multilingual speech databases at LDC
John J. Godfrey 

March 1994 Proceedings of the workshop on Human Language Technology HLT '94 
Publisher: Association for Computational Linguistics 

Full text available: pdf(378.89 KB) Additional Information: full citation, abstract, references

As multilingual products and technology grow in importance, the Linguistic Data 
Consortium (LDC) intends to provide the resources needed for research and development 
activities, especially in telephone-based, small-vocabulary recognition applications; 
language identification research; and large vocabulary continuous speech recognition 
research. The POLYPHONE corpora, a multilingual "database of databases," are specifically
designed to meet the needs of telephone application development and testin ... 

3 Expert/consultation system for a retrieval data-base with semantic network of
concepts
Peretz Shoval 

May 1981 ACM SIGIR Forum, Proceedings of the 4th annual international ACM
SIGIR conference on Information storage and retrieval: theoretical issues
in information retrieval SIGIR '81, Volume 16 Issue 1

Publisher: ACM Press 

Full text available: pdf(342.56 KB) Additional Information: full citation, abstract, references, citings

This paper describes a development and implementation of an expert/consultation system
for a retrieval data-base, that interfaces between the user and a retrieval system. The 
system's objective is to perform the information consultant's job in assisting a user to 
select the right vocabulary terms for his query. It is particularly useful for a novice user of 
a controlled-vocabulary, index-based retrieval system, who is not familiar with the 
vocabulary and the system Thesaurus. The user will enter ... 



4 Metadata for integrating speech documents in a text retrieval system
Ulrike Glavitsch, Peter Schauble, Martin Wechsler 
December 1994 ACM SIGMOD Record, Volume 23 issue 4 

Publisher: ACM Press 

Full text available: pdf(603.39 KB) Additional Information: full citation, abstract, citings, index terms

We present an information retrieval system that simultaneously allows to search for text 
and speech documents. The retrieval system accepts vague queries and performs a best- 
match search to find those documents that are relevant to the query. The output of the 
retrieval system is a list of ranked documents where the documents on the top of the list 
satisfy best the user's information need. The relevance of the documents is estimated by 
means of metadata (document description vectors). The metada ... 

5 Speech recognition as a computer graphics input technique (Panel Session)

Richard Rabin, Alan R. Strass, Mark Robillard, Sue Schedler, Matthew Peterson 

July 1982 ACM SIGGRAPH Computer Graphics, Proceedings of the 9th annual
conference on Computer graphics and interactive techniques SIGGRAPH
'82, Volume 16 Issue 3
Publisher: ACM Press 

Full text available: pdf(132.25 KB) Additional Information: full citation, abstract, index terms

Richard Rabin Interactive graphics systems typically require intense "hands busy/eyes 
busy and brains busy" activity on the part of the system user/operator. Voice input by 
means of automatic speech recognition equipment, offers major potential for improving 
user/operator productivity. It is the only input technique which does not require the direct 
use of hands and eyes. Voice input can replace or complement keyboards, function keys, 
tablets and other type ... 



6 On the interrelationship of dictionary size and completeness
H. Huther 

December 1989 Proceedings of the 13th annual international ACM SIGIR conference 
on Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(897.74 KB) Additional Information: full citation, abstract, references, index terms

When dictionaries for specific applications or subject fields are derived from a text 
collection, the frequency distribution of the terms in the collection gives information about 
the expected completeness of the dictionary. If only a subset of the terms in the collection 
is to be included in the dictionary, the completeness of the dictionary can be optimized 
with respect to dictionary size. In this paper, formulas for the relationship between the 
frequency distribution of the te ... 

7 Controlled natural language interfaces (extended abstract): the best of three worlds
Eva-Martin Mueckstein 

March 1985 Proceedings of the 1985 ACM thirteenth annual conference on Computer
Science
Publisher: ACM Press 

Full text available: pdf(267.73 KB) Additional Information: full citation, abstract, citings, index terms

This paper will discuss the problem of designing user-friendly interfaces for computer
applications. In particular, we will describe an interface that is based on mapping formal
into natural languages in a controlled and structured way. The basic approaches for 
designing interfaces range from formal or natural language to menu driven ones. Formal 
language interfaces such as query or programming languages are typically powerful in 
terms of their manipulative capabilities, safe in ... 

8 Lexicons, corpora, and evaluation: The hub and spoke paradigm for CSR evaluation
Francis Kubala, Jerome Bellegarda, Jordan Cohen, David Pallett, Doug Paul, Mike Phillips, 
Raja Rajasekaran, Fred Richardson, Michael Riley, Roni Rosenfeld, Bob Roth, Mitch Weintraub 
March 1994 Proceedings of the workshop on Human Language Technology HLT '94 
Publisher: Association for Computational Linguistics 

Full text available: pdf(609.61 KB) Additional Information: full citation, abstract, references

In this paper, we introduce the new paradigm used in the most recent ARPA-sponsored 
Continuous Speech Recognition (CSR) evaluation and then discuss the important features 
of the test design. The 1993 CSR evaluation was organized in a novel fashion in an attempt
to accommodate research over a broad variety of important problems in CSR while
maintaining a clear program-wide research focus. Furthermore, each test component in 
the evaluation was designed as an experiment to extract as much information ... 



9 The role of the computer in ethnographic analysis (abstract only) 
Paul Beynon Davies 

May 1981 ACM SIGSOC Bulletin, Proceedings of the joint conference on Easier and
more productive use of computer systems. (Part - I): Information
processing in the social sciences and humanities - Volume 1981, Volume 12-13
Issue 4-1
Publisher: ACM Press 

Additional Information: full citation, abstract, index terms

Art and architecture literature presents indexing difficulties due to the absence of a 
recognized controlled vocabulary. A recent investigation showed a number of independent 
partial efforts targeted to local needs. The Art and Architecture Thesaurus (AAT) group is 
building on the experience of others to create a unified, hierarchical thesaurus for these 
fields. Although the thesaurus itself will be in machine-readable form, the real value of
automation will be the ability to search hierarchica ... 




10 A muscle model for animation three-dimensional facial expression 
Keith Waters 

August 1987 ACM SIGGRAPH Computer Graphics, Proceedings of the 14th annual
conference on Computer graphics and interactive techniques SIGGRAPH
'87, Volume 21 Issue 4
Publisher: ACM Press 

Full text available: pdf(995.74 KB) Additional Information: full citation, abstract, references, citings, index terms

The development of a parameterized facial muscle process, that incorporates the use of a 
model to create realistic facial animation is described. Existing methods of facial 
parameterization have the inherent problem of hard-wiring performable actions. The 
development of a muscle process that is controllable by a limited number of parameters 
and is non-specific to facial topology allows a richer vocabulary and a more general 
approach to the modelling of the primary facial expressions. A brief discu ...



11 Second language acquisition and CS1 
Anne Gates Applin 

February 2001 ACM SIGCSE Bulletin, Proceedings of the thirty-second SIGCSE
technical symposium on Computer Science Education SIGCSE '01,
Volume 33 Issue 1
Publisher: ACM Press

Full text available: pdf(431.39 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents an empirical study of the relative effectiveness of two teaching 
methods used in CS1 classrooms. While the teaching methods are nothing new, the results 
of the study are an important contribution to the body of computer science education 
literature. The research design should also be of interest in that it demonstrates how 
statistical significance can be achieved with a relatively small sample by using the 
naturally occurring groups that we have as course sections. The teachin ...

12 Automatic indexing and Government-Binding Theory
Robert J. Kuhns 

August 1990 Proceedings of the 13th conference on Computational linguistics - 
Volume 3 

Publisher: Association for Computational Linguistics 

Full text available: pdf(280.65 KB) Additional Information: full citation, abstract, references

This project note describes a system that receives, parses, indexes, and routes news 
reports. The core of this automatic indexer is a parser based on Government-Binding 
Theory which derives thematic and binding relationships of arguments of the sentences of 
stories. These syntactic structures are interpreted by a semantic processor which is linked 
to conceptual representations of terms from a controlled indexing vocabulary. As a result, 
the system is capable of indexing news with respect to a la ... 

13 Machine translation: Investigating the possibility of a microprocessor-based machine
translation system

Harold L. Somers 

February 1983 Proceedings of the first conference on Applied natural language 
processing 

Publisher: Association for Computational Linguistics 

Full text available: pdf(550.13 KB), Publisher Site
Additional Information: full citation, abstract, references

This paper describes an on-going research project being carried out by staff and students 
at the Centre for Computational Linguistics to examine the feasibility of Machine 
Translation (MT) in a microprocessor environment. The system incorporates as far as 
possible features of large-scale MT systems that have proved desirable or effective: it is 
multilingual, algorithms and data are strictly separated, and the system is highly modular. 
Problems of terminological polysemy and syntactic complexity ... 

14 Large-scale controlled vocabulary indexing for named entities
Mark Wasson 

April 2000 Proceedings of the sixth conference on Applied natural language
processing
Publisher: Morgan Kaufmann Publishers Inc. 

Full text available: pdf(530.63 KB), Publisher Site
Additional Information: full citation, abstract, references

A large-scale controlled vocabulary indexing system is described. The system currently 
covers almost 70,000 named entity topics, and applies to documents from thousands of
news publications. Topic definitions are built through substantially automated knowledge 
engineering. 

15 Using Kohonen maps to determine document similarity
Jennifer Farkas 
October 1994 Proceedings of the 1994 conference of the Centre for Advanced Studies
on Collaborative research
Publisher: IBM Press 

Full text available: pdf(575.72 KB) Additional Information: full citation, abstract, references, index terms

In this paper we present some experimental results on the classification of natural 
language documents using Kohonen's self-organizing-map neural network paradigm. We 
discuss, in particular, how the classification accuracy can be improved if the standard 
keyword representation of documents is enhanced by including specific weights, 
thesaurally-defined relations among keywords, and additional synonyms for keywords. We 
sketch the main features of a prototype of an automatic document classification ... 

16 Retrieval test evaluation of a rule based automatic indexing (AIR/PHYS)
N. Fuhr, G. E. Knorz 

July 1984 Proceedings of the 7th annual international ACM SIGIR conference on
Research and development in information retrieval
Publisher: British Computer Society 

Full text available: pdf(1.06 MB) Additional Information: full citation, abstract, references

The automatic indexing system AIR/PHYS and its evaluation by means of a retrieval test 
with 309 requests and 15,000 documents is described. First, the underlying conception of 
a rule based approach is given which is suited to the task of a controlled-vocabulary 
indexing of even large subject fields. Preconditions, performance and results of the 
retrieval test are described, including first results of retrieval runs with weighted automatic 
indexing. 

17 Session 7: Visual and spatial communication and task organization using the visual
knowledge builder

Frank Shipman, Robert Airhart, Haowei Hsieh, Preetam Maloor, J. Michael Moore, Divya Shah 
September 2001 Proceedings of the 2001 International ACM SIGGROUP Conference on
Supporting Group Work
Publisher: ACM Press 

Full text available: pdf(1.21 MB) Additional Information: full citation, abstract, references, citings, index terms

When people share a workspace, they naturally create visual structures which organize 
resources, communicate interpretations, and coordinate activities. To support this mode of 
communication and coordination we have built the Visual Knowledge Builder (VKB). VKB
supports the incremental visual interpretation of information. Through the emergence and 
evolution of visual languages, communication between VKB users sharing a workspace 
grows over time. VKB has been used for two years in note taking, w ... 

Keywords: collaborative writing, emergent structure, information workspace, magnetic 
poetry, navigable history, spatial parser, visual communication, visual language 



18 JurisConsulto: retrieval in jurisprudencial text bases using juridical terminology 
Tania C. D'Agostini Bueno, Christiane Gresse von Wangenheim, Eduardo da Silva Mattos, 
Hugo Cesar Hoeschl, Ricardo M. Barcia 

June 1999 Proceedings of the 7th international conference on Artificial intelligence 
and law 

Publisher: ACM Press 

Full text available: pdf(982.29 KB) Additional Information: full citation, abstract, references, citings, index terms

In the legal domain, jurisprudence has an important role as a juridical source; its decisions 
support the application of the Law to a concrete case. The problem is that Brazilian Courts 
produce an enormous amount of decisions every year, turning these text sources larger
every time and forcing juridical professionals to spend more time in the search for a
relevant decision. Sophisticated AI techniques are needed to minimize searches time and
improves the quality and appropriateness of the r ...

Results (page 1): +abstract:controlled +abstract:vocabulary abstract-category
http://portal.acm.org/results.cfm?CFID=76311368&CFTOKEN=5024170^ 5/20/06

19 In search of heuristics for keyword detection (abstract only): my source of discontent
Amos O. Olagunju 

February 1987 Proceedings of the 15th annual conference on Computer Science 
Publisher: ACM Press 

Full text available: pdf(136.74 KB) Additional Information: full citation, abstract, references, index terms

The question of how to index documents is a central problem in document retrieval. The 
indexing problem can be stated as follows. There exists a large document collection, 
together with population of retrieval system customers, each of whom wants information 
that he thinks might be supplied by documents in the collection. How should the 
documents in the collection be identified ("indexed," "cataloged," etc.) so that the 
collection can be searched to the maximal colle ... 

20 A fuzzy approach to faceted classification and retrieval of reusable software
components

Francesco Baruchelli, Giancarlo Succi
June 1997 ACM SIGAPP Applied Computing Review, Volume 5 issue 1 
Publisher: ACM Press 

Full text available: pdf(346.24 KB) Additional Information: full citation, abstract, index terms

Software reuse can be very useful in increasing the productivity and the quality level of a 
company, but appropriate classification and retrieval tools have to be provided in order to 
exploit its pros. The classic classification and retrieval methods based upon a controlled, 
fixed vocabulary are not very flexible and can be unsatisfactory in many cases, especially 
when the specification are not fully defined. An improvement in this sense can be obtained 
applying some simple fuzzy concepts to the ... 